Implement request timeout middleware #46115

Merged: 11 commits, Mar 16, 2023

@Kahbazi Kahbazi commented Jan 15, 2023

Implement request timeout middleware

Implement a middleware which links a cancellation token to HttpContext.RequestAborted in order to handle request timeout.

Description

The middleware looks up the endpoint metadata and gets the timeout from it. The lookup order is RequestTimeoutPolicy, then RequestTimeoutAttribute; if neither is available, it falls back to RequestTimeoutOptions.DefaultTimeout. If the DisableRequestTimeoutAttribute metadata is present, the middleware is skipped.
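
A minimal sketch of the lookup order described above, for illustration only. The types below are hypothetical stand-ins that reuse the names from the description; their shapes (a record with a `Timeout` property, an attribute constructor taking milliseconds) and the `TimeoutResolver` helper are assumptions, not the code in this PR.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the metadata types named in the description.
public sealed record RequestTimeoutPolicy(TimeSpan Timeout);

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public sealed class RequestTimeoutAttribute : Attribute
{
    // Assumed shape: the timeout is given in milliseconds.
    public RequestTimeoutAttribute(int milliseconds) =>
        Timeout = TimeSpan.FromMilliseconds(milliseconds);

    public TimeSpan Timeout { get; }
}

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public sealed class DisableRequestTimeoutAttribute : Attribute { }

public static class TimeoutResolver
{
    // Returns null when no timeout applies to the endpoint.
    public static TimeSpan? Resolve(IReadOnlyList<object> endpointMetadata, TimeSpan? defaultTimeout)
    {
        // The disable marker short-circuits everything else.
        if (endpointMetadata.OfType<DisableRequestTimeoutAttribute>().Any())
        {
            return null;
        }

        // 1. A policy on the endpoint wins.
        var policy = endpointMetadata.OfType<RequestTimeoutPolicy>().LastOrDefault();
        if (policy is not null)
        {
            return policy.Timeout;
        }

        // 2. Then the attribute.
        var attribute = endpointMetadata.OfType<RequestTimeoutAttribute>().LastOrDefault();
        if (attribute is not null)
        {
            return attribute.Timeout;
        }

        // 3. Finally the configured default, which may itself be null.
        return defaultTimeout;
    }
}
```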

Questions:

  • I didn't quite understand the need for RequestTimeoutPolicy as metadata. Why not just add a constructor that takes a TimeSpan to RequestTimeoutAttribute and avoid looking up both kinds of metadata?
  • Should the CancellationTokenSources be reused? Something like ActivityCancellationTokenSource.
  • How should I test this? My understanding is that we should not use Task.Delay in tests, but I see no other way to test this. Should I just use that?

Fixes #45732

@ghost ghost added the area-runtime and community-contribution labels Jan 15, 2023
@ghost

ghost commented Jan 15, 2023

Thanks for your PR, @Kahbazi. Someone from the team will get assigned to your PR shortly and we'll get it reviewed.

Member

@JamesNK JamesNK left a comment

This is basically like a gRPC deadline on the server. You can see the main class for implementing that here: https://github.com/grpc/grpc-dotnet/blob/948be08b088fbf449982b731c8e1a7cf8abb1965/src/Grpc.AspNetCore.Server/Internal/ServerCallDeadlineManager.cs

The gRPC implementation is a lot more complicated than needed here because I wanted to avoid allocating a CancellationTokenSource unless requested. It also supports timeouts greater than the maximum delay a CancellationTokenSource accepts.

Edit: I read the original issue and see a significant difference between this and gRPC deadlines. Unlike a deadline, this is cooperative and doesn't cancel the original request.

src/Http/Http/src/Timeouts/RequestTimeoutsMiddleware.cs (outdated)
{
var originalToken = context.RequestAborted;
var timeoutCts = new CancellationTokenSource(timespan.Value);
var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(context.RequestAborted, timeoutCts.Token);
Member

@JamesNK JamesNK Jan 16, 2023

Creating a linked token source is quite expensive. There are now 3 token sources for the request.

  1. The original RequestAborted CTS
  2. The timeout CTS
  3. The linked CTS

You can get that down to two CTSs. Replace the linked CTS with a subscription to the original CTS's token, and call timeoutCts.Cancel() in the callback. The gotchas: you have to add thread safety between timeoutCts.Cancel() and timeoutCts.Dispose() (canceling a disposed CTS throws), and you have to unregister the subscription.
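
A rough sketch of the two-CTS idea above, assuming a hypothetical `TimeoutScope` helper (the name and shape are illustrative, not the code that was merged). It subscribes to the request's token instead of creating a linked source, guards `Cancel()` against `Dispose()` with a lock, and unregisters the subscription on dispose.

```csharp
using System;
using System.Threading;

// Hypothetical helper for illustration only.
public sealed class TimeoutScope : IDisposable
{
    private readonly object _lock = new();
    private readonly CancellationTokenSource _timeoutCts;
    private readonly CancellationTokenRegistration _registration;
    private bool _disposed;

    public TimeoutScope(CancellationToken requestAborted, TimeSpan timeout)
    {
        _timeoutCts = new CancellationTokenSource(timeout);
        // Capture the token up front; reading CTS.Token after Dispose throws.
        Token = _timeoutCts.Token;

        // Subscribe to the original token instead of allocating a linked CTS.
        _registration = requestAborted.Register(static state =>
        {
            if (state is TimeoutScope scope)
            {
                scope.CancelIfNotDisposed();
            }
        }, this);
    }

    // Fires on timeout or when the original request token is canceled.
    public CancellationToken Token { get; }

    private void CancelIfNotDisposed()
    {
        lock (_lock)
        {
            // Canceling a disposed CTS throws, so check under the lock.
            if (!_disposed)
            {
                _timeoutCts.Cancel();
            }
        }
    }

    public void Dispose()
    {
        // Unregister first so the callback cannot start after this point.
        _registration.Dispose();

        lock (_lock)
        {
            if (_disposed)
            {
                return;
            }

            _disposed = true;
            _timeoutCts.Dispose();
        }
    }
}
```

The middleware would hand `Token` to the rest of the pipeline and dispose the scope when the request completes. Pooling or caching the timeout CTS, as discussed in the replies, is left out of this sketch.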

Member

I think we need to get this down to 0 (amortized). We need to use an ObjectPool<CancellationTokenSource> or the shared CancellationTokenSourcePool (https://github.com/dotnet/aspnetcore/blob/4535ea1263e9a24ca8d37b7266797fe1563b8b12/src/Shared/CancellationTokenSourcePool.cs). This, combined with RegisterForDispose, should get the allocations per request down (it removes the async state machine).

Member

I'm tempted to replace this custom pool with the ObjectPool since we just improved the performance.

cc @BrennanConroy

Member

We could use IPersistentState to store the CTS on the HttpContext. It will be cached along with the context.

That avoids any locking and contention from a shared pool.

Member

Nice. We can try that.

Member Author

@JamesNK I'm still trying to figure out your solution. Are you suggesting storing the original CTS from here in IPersistentState

_abortedCts = new CancellationTokenSource();

and then calling CancelAfter(timespan)?

Member Author

@Tratcher Any guidance on this thread? I would like to do this one in this PR if I can.

Member

Get the benchmarks working first; then we can measure the impact of this change.

Member Author

Sure. Could you please point me to a benchmark sample and tell me what exactly I need to benchmark? I'm thinking of a pipeline with and without the request timeouts middleware.

Member

https://github.com/dotnet/aspnetcore/blob/main/src/Middleware/RequestDecompression/perf/Microbenchmarks/RequestDecompressionMiddlewareBenchmark.cs

  • An endpoint with no metadata and no default timeout
  • A default timeout
  • A default overridden by the disable attribute
  • A timeout set on an endpoint
  • A named policy
  • A very short timeout that fires

@Tratcher Tratcher self-assigned this Jan 17, 2023
@Tratcher
Member

  • I didn't quite understand the need for RequestTimeoutPolicy as metadata. Why not just add a constructor that takes a TimeSpan to RequestTimeoutAttribute and avoid looking up both kinds of metadata?

We didn't want a TimeSpan constructor for RequestTimeoutAttribute because TimeSpan values can't be used as attribute arguments.

@Tratcher
Member

Tratcher commented Jan 17, 2023

  • Should the CancellationTokenSources be reused? Something like ActivityCancellationTokenSource.
  • How should I test this? My understanding is that we should not use Task.Delay in tests, but I see no other way to test this. Should I just use that?

These two questions go together. Factor out the token creation to an internal service. That service would be responsible for both recycling tokens, as well as mocking them when needed. The mock implementation can use a manually advancing clock rather than a timer.
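
One way the suggestion above could look, as a sketch under assumptions: the interface and class names are hypothetical, and recycling of sources is omitted. The production factory relies on the CTS's own timer, while the test factory cancels sources only when a manual clock is advanced, so tests need no Task.Delay.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical internal service that owns creation of timeout sources.
public interface ITimeoutSourceFactory
{
    CancellationTokenSource Create(TimeSpan timeout);
}

// Production-style implementation: the CTS cancels itself via its timer.
public sealed class TimerTimeoutSourceFactory : ITimeoutSourceFactory
{
    public CancellationTokenSource Create(TimeSpan timeout) =>
        new CancellationTokenSource(timeout);
}

// Test implementation: nothing fires until the clock is advanced by hand.
public sealed class ManualTimeoutSourceFactory : ITimeoutSourceFactory
{
    private readonly List<(TimeSpan Deadline, CancellationTokenSource Source)> _pending = new();
    private TimeSpan _now = TimeSpan.Zero;

    public CancellationTokenSource Create(TimeSpan timeout)
    {
        // No timer: the deadline is tracked against the manual clock.
        var source = new CancellationTokenSource();
        _pending.Add((_now + timeout, source));
        return source;
    }

    public void Advance(TimeSpan elapsed)
    {
        _now += elapsed;

        // Iterate backwards so removal does not skip entries.
        for (var i = _pending.Count - 1; i >= 0; i--)
        {
            if (_pending[i].Deadline <= _now)
            {
                _pending[i].Source.Cancel();
                _pending.RemoveAt(i);
            }
        }
    }
}
```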

@Kahbazi Kahbazi force-pushed the kahbazi/timeoutMiddleware branch from 81d90c5 to dc29a51 Compare January 19, 2023 15:11
Member

@Tratcher Tratcher left a comment

Revert src/submodules/googletest

@davidfowl
Member

We'll want a performance test for this middleware as well. We need to make it as pay-for-play as possible. Most requests won't time out, so we should optimize for that.

We may also need to do something about long-running requests. cc @BrennanConroy

@JamesNK
Member

JamesNK commented Jan 20, 2023

How long is a long-running request? Hours or days?

What I did for gRPC + long timeouts was to have a Timer that reschedules itself until the timeout is zero. However, there is an extra allocation there.

What you can do is have different paths for short timeouts (milliseconds less than 2^32) and long timeouts.

  • Short = optimized = just use a resettable CancellationTokenSource, and cache it
  • Long = Timer allocation = rescheduling timer route.

If a request is long-running then per-request allocations aren't a big concern.
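
A sketch of the rescheduling-timer route for long timeouts, under stated assumptions: the `ReschedulingTimeout` class and its `maxChunk` parameter are hypothetical, and `maxChunk` must be positive and within what `Timer.Change` accepts. The timer re-arms itself in chunks until nothing remains, then cancels the token.

```csharp
using System;
using System.Threading;

// Hypothetical helper for illustration only.
public sealed class ReschedulingTimeout : IDisposable
{
    private readonly object _lock = new();
    private readonly CancellationTokenSource _cts = new();
    private readonly TimeSpan _maxChunk;
    private readonly Timer _timer;
    private TimeSpan _remaining;
    private bool _disposed;

    public ReschedulingTimeout(TimeSpan timeout, TimeSpan maxChunk)
    {
        _remaining = timeout;
        _maxChunk = maxChunk;
        Token = _cts.Token;

        // Create the timer stopped, then arm it for the first chunk.
        _timer = new Timer(static state =>
        {
            if (state is ReschedulingTimeout self)
            {
                self.OnTick();
            }
        }, this, Timeout.InfiniteTimeSpan, Timeout.InfiniteTimeSpan);

        lock (_lock)
        {
            ScheduleNext();
        }
    }

    public CancellationToken Token { get; }

    // Must be called under the lock.
    private void ScheduleNext()
    {
        var delay = _remaining > _maxChunk ? _maxChunk : _remaining;
        _remaining -= delay;
        _timer.Change(delay, Timeout.InfiniteTimeSpan);
    }

    private void OnTick()
    {
        lock (_lock)
        {
            if (_disposed)
            {
                return;
            }

            if (_remaining > TimeSpan.Zero)
            {
                ScheduleNext(); // More time left: re-arm for the next chunk.
            }
            else
            {
                _cts.Cancel(); // The whole timeout has elapsed.
            }
        }
    }

    public void Dispose()
    {
        lock (_lock)
        {
            if (_disposed)
            {
                return;
            }

            _disposed = true;
            _timer.Dispose();
            _cts.Dispose();
        }
    }
}
```

A short timeout would skip this class entirely and use a cached, resettable CancellationTokenSource, as the comment describes.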

@davidfowl
Member

How long is a long-running request? Hours or days?

Something this middleware should ignore. I'm thinking about streaming requests:

  • WebSockets
  • GRPC streams
  • SSE
  • LongPolling
  • SignalR
  • WebTransport

The ability to set a global default timeout devoid of a specific endpoint breaks those scenarios. Should there be a feature to disable the timeout (similar to what we have for response buffering)?

@Tratcher
Member

The ability to set a global default timeout devoid of a specific endpoint breaks those scenarios. Should there be a feature to disable the timeout (similar to what we have for response buffering)?

Timeouts can already be disabled per endpoint via metadata. I understand that can be tedious/redundant, but I hesitate to try to special case all of those scenarios in the middleware. Having an opt-out feature is a bit odd because the timeout already started and may fire before it's disabled.

Proposal:

  • For WebTransport and WebSockets/Upgrade we do have the ability to stop the timer when they are accepted by shimming those features.
  • SignalR can configure/disable timeouts for its own endpoints as needed.
  • Long Polling already has a timeout mechanism, it can opt out of this one via metadata.
  • SSE would likely opt out as well.
  • GRPC is an interesting case because there are both streamed and non-streamed requests and middleware can't tell the difference. @JamesNK when gRPC is generating endpoints does it know which endpoints are for streaming so it can opt out of timeouts for those endpoints?

@Kahbazi Kahbazi force-pushed the kahbazi/timeoutMiddleware branch from 36ae661 to 3d11e63 Compare January 23, 2023 20:15
@Kahbazi Kahbazi force-pushed the kahbazi/timeoutMiddleware branch from 96ac92f to 17dffdd Compare March 9, 2023 21:44
@Kahbazi Kahbazi requested a review from captainsafia as a code owner March 9, 2023 21:44
@adityamandaleeka
Member

@Kahbazi I'm pushing a few cleanup updates to this PR so we can get this merged quickly.

@Tratcher Tratcher enabled auto-merge (squash) March 16, 2023 16:19
@Tratcher Tratcher enabled auto-merge (squash) March 16, 2023 16:53
@Tratcher Tratcher merged commit f9cc9e1 into dotnet:main Mar 16, 2023
@ghost ghost added this to the 8.0-preview3 milestone Mar 16, 2023
@adityamandaleeka
Member

Looks like this is merged. Thank you for all your work on this @Kahbazi!

@Kahbazi
Member Author

Kahbazi commented Mar 16, 2023

Looks like this is merged. Thank you for all your work on this @Kahbazi!

You're welcome. Sorry I couldn't complete this on time. We are approaching the New Year holidays here.

@ghost

ghost commented Mar 16, 2023

Hi @Kahbazi. It looks like you just commented on a closed PR. The team will most probably miss it. If you'd like to bring something important up to their attention, consider filing a new issue and add enough details to build context.

@adityamandaleeka
Member

Looks like this is merged. Thank you for all your work on this @Kahbazi!

You're welcome. Sorry I couldn't complete this on time. We are approaching the New Year holidays here.

No worries at all. And happy new year!

@amcasey amcasey added the area-networking label and removed the area-runtime label Jun 6, 2023
@github-actions github-actions bot locked and limited conversation to collaborators Dec 8, 2023
Labels: area-networking, blog-candidate, community-contribution
Linked issue: Request timeouts middleware